Install requirements¶
In [ ]:
%pip install -r requirements.txt
Import and instantiate the PeakCalling class¶
In [1]:
from peakcaller import PeakCalling
# Enter the number of reads for each dataset to normalize data
reads_count_1 = 1504149
reads_count_2 = 8991837
# Instantiate the PeakCalling class
# (no line-continuation backslashes are needed inside parentheses)
peak_calling = PeakCalling(
    data_1='./data/coverage_16t5_plus_r209.txt',
    data_2='./data/coverage_185_t5_sorted.txt',
    threshold=0.6,
    window_size=250,
    reads_count_1=reads_count_1,
    reads_count_2=reads_count_2,
)
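The two read counts differ roughly sixfold, so raw coverage from the two datasets is not directly comparable. How `PeakCalling` uses the counts internally is not shown in this notebook. The block below is only a minimal sketch of one common scheme, scaling coverage to counts per million reads. `normalize_coverage` is a hypothetical helper and not part of `peakcaller`, and the coverage values are toy numbers.

```python
def normalize_coverage(coverage, reads_count, scale=1_000_000):
    """Scale per-position coverage to counts per `scale` reads.

    Assumed normalization scheme for illustration only; peakcaller
    may normalize differently.
    """
    if reads_count <= 0:
        raise ValueError("reads_count must be positive")
    factor = scale / reads_count
    return [depth * factor for depth in coverage]


# Toy coverage values, not taken from the real data files
raw_1 = [10, 20, 30]   # shallow library
raw_2 = [20, 40, 60]   # library sequenced twice as deeply

norm_1 = normalize_coverage(raw_1, reads_count=1_000_000)
norm_2 = normalize_coverage(raw_2, reads_count=2_000_000)

# After scaling, the two toy profiles land on the same scale
print(norm_1)
print(norm_2)
```

With depth differences removed this way, a per-window comparison between datasets reflects changes in coverage shape rather than in library size.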
Find significant changes and write them to a pandas DataFrame¶
In [2]:
changes = peak_calling.find_significant_coverage_changes()
changes.head(10)
Out[2]:
| | Window | Change | Start_Pos | End_Pos |
|---|---|---|---|---|
| 18 | 18 | 0.912614 | 4501 | 4750 |
| 19 | 19 | 0.846189 | 4751 | 5000 |
| 20 | 20 | 0.663222 | 5001 | 5250 |
| 24 | 24 | 0.613345 | 6001 | 6250 |
| 50 | 50 | 0.789492 | 12501 | 12750 |
| 51 | 51 | 0.816089 | 12751 | 13000 |
| 64 | 64 | 0.896764 | 16001 | 16250 |
| 65 | 65 | 0.832993 | 16251 | 16500 |
| 66 | 66 | 0.641532 | 16501 | 16750 |
| 74 | 74 | 0.654182 | 18501 | 18750 |
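In the output above, the coordinates follow a simple pattern for `window_size=250`: `Start_Pos` equals `Window * 250 + 1` and `End_Pos` equals `(Window + 1) * 250`, which suggests zero-based window indices and 1-based inclusive positions. The sketch below reproduces that mapping. `window_bounds` is a hypothetical helper inferred from the table, not a function from `peakcaller`.

```python
def window_bounds(window, window_size=250):
    """Return (start, end) for a zero-based window index.

    Positions are 1-based and inclusive, as inferred from the
    Start_Pos / End_Pos columns in the output table.
    """
    start = window * window_size + 1
    end = (window + 1) * window_size
    return start, end


# Check the mapping against a few rows from the table
for window in (18, 50, 74):
    print(window, window_bounds(window))
```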
Visualize coverage and significant changes¶
In [3]:
peak_calling.visualize_coverage()
Match significant changes with genome annotation¶
In [4]:
gff_path = 'data/t5.gff3'
peak_calling.compare_coverage_changes_with_annotation(gff_annotation=gff_path)
/opt/anaconda3/envs/test/lib/python3.11/site-packages/genomenotebook/track.py:81: UserWarning: You are trying to plot more than 10^5 glyphs, this might overflow your memory. Consider using bounds or reducing the number of datapoints.
Green line: ./data/coverage_16t5_plus_r209.txt
Blue line: ./data/coverage_185_t5_sorted.txt
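Matching a changed window to the annotation comes down to an interval-overlap test against the GFF3 features. The following is a minimal sketch of that idea, not the code behind `compare_coverage_changes_with_annotation`. `parse_gff3_lines` and `overlapping_features` are hypothetical helpers, and the two GFF3 records are invented for illustration rather than taken from `data/t5.gff3`.

```python
def parse_gff3_lines(lines):
    """Parse GFF3 feature lines into dictionaries.

    GFF3 has nine tab-separated columns; start and end (columns 4
    and 5) are 1-based and inclusive. Comment and directive lines
    start with '#'.
    """
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip malformed or non-feature lines
        features.append({
            "type": cols[2],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "attributes": cols[8],
        })
    return features


def overlapping_features(features, start, end):
    """Return features sharing at least one position with [start, end]."""
    return [f for f in features if f["start"] <= end and f["end"] >= start]


# Invented records for illustration only
example_gff = [
    "##gff-version 3",
    "seq1\texample\tgene\t4400\t4900\t.\t+\t.\tID=geneA",
    "seq1\texample\tgene\t9000\t9500\t.\t-\t.\tID=geneB",
]

features = parse_gff3_lines(example_gff)
hits = overlapping_features(features, start=4501, end=4750)
print([f["attributes"] for f in hits])
```

For a real annotation file, the same functions could be fed the lines of an opened file, for example `parse_gff3_lines(open(gff_path))`.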
